UJM at INEX 2008: Pre-impacting of Tags Weights

Authors

  • Mathias Géry
  • Christine Largeron
  • Franck Thollard
Abstract

This paper addresses the impact of structure on the term weighting function in the context of focused Information Retrieval (IR). Our model considers a certain kind of structural information: tags that represent logical structure (title, section, paragraph, etc.) and tags related to formatting (bold, italic, center, etc.). We take the influence of tags into account by estimating the probability that a tag distinguishes relevant terms. This weight is integrated into the term weighting function. Experiments on a large collection during the INEX 2008 IR competition showed improvements for focused retrieval.

Similar resources

ENSM-SE and UJM at INEX 2010: Scoring with Proximity and Tag Weights

This paper presents our participation in the Relevant in Context task (ad-hoc track) during the 2010 INEX competition, and a posterior analysis. Two models presented in previous editions of INEX by the authors were merged for our 2010 participation. The first one is based on the proximity of the query terms in the documents [1] and the second one is based on learnt tag weights [2]. The results ...

UJM at INEX 2008 XML Mining Track

This paper reports our experiments carried out for the INEX XML Mining track, consisting in developing categorization (or classification) and clustering methods for XML documents. We represent XML documents as vectors of indexed terms. For our first participation, the purpose of our experiments is twofold: Firstly, our overall aim is to set up a categorization text only approach that can be use...

UJM at INEX 2009 XML Mining Track

This paper reports our experiments carried out for the INEX XML Mining track 2009, consisting in developing categorization methods for multi-labeled XML documents. We represent XML documents as vectors of indexed terms. The purpose of our experiments is twofold: firstly, we aim to compare strategies that reduce the index size using an improved feature selection criterion, CCD. Secondly, we compare...

CERIST at INEX 2015: Social Book Search Track

In this paper, we describe our participation in the INEX 2015 Social Book Search Suggestion Track (SBS). We have exploited in our experiments only the tags assigned by users to books provided from LibraryThing (LT). We have investigated the impact of the weight of each term of the topic in the retrieval model using two methods. In the first method, we have used the TF-IQF formula to assign a we...

Stability of INEX 2007 Evaluation Measures

XML-IR, a new domain in IR: XML as a standard document format in web & DL; growth in XML information repositories; increase in XML-IR systems. Two aspects of XML-IR systems: content (text/image/music/video info) and structure (info about the tags).

Publication date: 2008